Improving Children's Speech Recognition by HMM Interpolation with an Adults' Speech Recognizer

Authors

  • Stefan Steidl
  • Georg Stemmer
  • Christian Hacker
  • Elmar Nöth
  • Heinrich Niemann
Abstract

In this paper we address the problem of building a good speech recognizer when only a small amount of training data is available. The acoustic models can be improved by interpolation with the well-trained models of a second recognizer from a different application scenario. In our case, we interpolate a children’s speech recognizer with a recognizer for adults’ speech. Each hidden Markov model has its own set of interpolation partners; experiments were conducted with up to 50 partners. The interpolation weights are estimated automatically on a validation set using the EM algorithm. The word accuracy of the children’s speech recognizer improved from 74.6% to 81.5%, a relative improvement of almost 10%.
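
The weight estimation mentioned in the abstract can be illustrated with a generic EM update for mixture weights over fixed component models. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say which HMM parameters are interpolated or how partners are chosen, so the function names `estimate_interpolation_weights` and `log_likelihood`, the per-frame likelihood table, and the toy numbers are all hypothetical.

```python
import math


def estimate_interpolation_weights(likelihoods, n_iter=50):
    """Estimate mixture weights by EM while the partner models stay fixed.

    likelihoods[t][k] is the (assumed precomputed) likelihood of validation
    frame t under interpolation partner k. Returns one weight per partner.
    """
    n_partners = len(likelihoods[0])
    weights = [1.0 / n_partners] * n_partners  # start from uniform weights
    for _ in range(n_iter):
        counts = [0.0] * n_partners
        for row in likelihoods:
            total = sum(w * p for w, p in zip(weights, row))
            if total == 0.0:
                continue  # frame explained by no partner; skip it
            # E-step: posterior responsibility of each partner for this frame
            for k, (w, p) in enumerate(zip(weights, row)):
                counts[k] += w * p / total
        norm = sum(counts)
        if norm == 0.0:
            break  # nothing to re-estimate from
        # M-step: new weight = normalized expected count
        weights = [c / norm for c in counts]
    return weights


def log_likelihood(likelihoods, weights):
    """Log-likelihood of the validation frames under the interpolated model."""
    return sum(
        math.log(sum(w * p for w, p in zip(weights, row)))
        for row in likelihoods
    )


if __name__ == "__main__":
    # Toy data (hypothetical): column 0 = children's model, column 1 = adult partner.
    validation = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.6, 0.5]]
    weights = estimate_interpolation_weights(validation)
    print(weights, log_likelihood(validation, weights))
```

Estimating the weights on held-out validation data rather than on the training set matters here: on the training data, the model trained on that data would tend to receive nearly all of the weight.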

Similar resources

Presentation of K Nearest Neighbor Gaussian Interpolation and comparing it with Fuzzy Interpolation in Speech Recognition

The hidden Markov model (HMM) is a popular statistical method used in continuous and discrete speech recognition. The probability density function of the observation vectors in each state is estimated with discrete-density or continuous-density modeling. Continuous-density HMMs achieve a higher correct word recognition rate than discrete-density HMMs, but their computational complexity is very ...

HMM-Based Recognition and Adaptation of Persian Children's Speech

Children's speech shows higher variability than adults' speech, mainly because of children's shorter vocal tracts and smaller vocal folds, which lowers speech recognition accuracy (about 54.5% in this work). Adaptation techniques that reduce this variability have therefore been suggested. In this paper we focused on the problem of speech recognition for Pe...

HMM adaptation using linear spline interpolation with integrated spline parameter training for robust speech recognition

We recently proposed a method for HMM adaptation to noisy environments called Linear Spline Interpolation (LSI). LSI uses linear spline regression to model the relationship between clean and noisy speech features. In the original algorithm, stereo training data was used to learn the spline parameters that minimize the error between the predicted and actual noisy speech features. The estimated s...

Improving DNN-Based Automatic Recognition of Non-native Children's Speech with Adult Speech

Acoustic models for state-of-the-art DNN-based speech recognition systems are typically trained using at least several hundred hours of task-specific training data. However, this amount of training data is not always available for some applications. In this paper, we investigate how to use an adult speech corpus to improve DNN-based automatic speech recognition for non-native children's speech....

Publication date: 2003